A Korean Information Retrieval Model Alleviating Syntactic Term Mismatches

Authors

Bo-Hyun Yun

Yong-Jae Kwak

Hae-Chang Rim

Abstract

In Korean information retrieval, mismatches between indexing terms and query terms are a serious obstacle to improving retrieval performance. Terms fail to match because compound nouns are spaced inconsistently and because a phrase can be written in various forms. This paper presents an extended Korean information retrieval model that alleviates these mismatches between indexing terms and query terms. In this model, we segment compound nouns into unit nouns using statistical information and a preference rule. We then combine unit nouns or single nouns into synthesized compound nouns under adjacency constraints. Among all candidate synthesized compound nouns, we filter out meaningless ones using mutual information and the relative frequency of category pairs. Moreover, we compute similarity with partial matching for compound nouns. The experimental results show that the proposed method can overcome differences between the surface forms of terms.
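The filtering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain pointwise mutual information over adjacent unit-noun pairs; the abstract does not give the paper's actual formula, threshold, or its category-pair frequency component, so the function name `filter_compounds`, the threshold value, and the toy corpus (English words standing in for Korean unit nouns) are all hypothetical.

```python
import math
from collections import Counter


def filter_compounds(noun_sequences, threshold=0.0):
    """Keep adjacent noun pairs whose pointwise mutual information exceeds threshold.

    noun_sequences: list of lists of unit nouns (assumed already segmented).
    Returns a dict mapping (left, right) pairs to their PMI score in bits.
    This is a generic PMI filter, not the paper's exact scoring.
    """
    unigrams = Counter()
    bigrams = Counter()
    for nouns in noun_sequences:
        unigrams.update(nouns)
        # Adjacency constraint (assumed): only neighbouring nouns form candidates.
        bigrams.update(zip(nouns, nouns[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    kept = {}
    for (left, right), count in bigrams.items():
        p_pair = count / total_bigrams
        p_left = unigrams[left] / total_unigrams
        p_right = unigrams[right] / total_unigrams
        score = math.log2(p_pair / (p_left * p_right))
        if score > threshold:
            kept[(left, right)] = score
    return kept


# Hypothetical toy corpus of segmented unit nouns.
corpus = [
    ["information", "retrieval", "model"],
    ["information", "retrieval", "system"],
    ["model", "information"],
    ["system", "model"],
]
kept = filter_compounds(corpus, threshold=2.0)
for pair, score in sorted(kept.items()):
    print(pair, round(score, 2))
```

On this toy data the frequently co-occurring pair scores above the threshold while an incidental pair such as ("model", "information") is dropped. A faithful implementation would also need the relative frequency of category pairs, which is omitted here.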

Similar resources

University of Glasgow at CLEF 2013: Experiments in eHealth Task 3 with Terrier

In our participation in the CLEF 2013 eHealth task 3, we investigate (1) the effectiveness of our Divergence from Randomness (DFR) framework on retrieving medical webpages, (2) the adoption of classical pseudo-relevance feedback for improving the representation of the queries, and (3) the exploitation of a collection enrichment technique for alleviating the mismatches between the terms in docum...

Using syntactic information in handling natural language queries for extended boolean retrieval model

There is considerable evidence that trained users can achieve good search effectiveness through structured boolean queries rather than simple keyword queries, because boolean operators help represent users' information needs more accurately. However, it is not normally easy for ordinary users to construct effective boolean queries using appropriate boolean operators. ...

Quasi-Synchronous Dependence Model for Information Retrieval

Incorporating syntactic features in a retrieval model has had very limited success in the past, with the exception of term dependencies. This paper presents a new term dependency modeling approach based on a dependency parsing technique used for both queries and documents. Our model is inspired by a quasi-synchronous stochastic process for machine translation [21]. It describes four different t...

Two-Level Alignment by Words and Phrases Based on Syntactic Information

As a part of work on alignment of the English and Korean parallel corpus, this paper presents a statistical translation model incorporating linguistic knowledge of syntactic and phrasal information for better translations. For this, we propose three models: First, we incorporate syntactic information such as part of speech into the word-based lexical alignment. Based on this model, we propose t...

Applying Multiple Characteristics and Techniques in the NICT Information Retrieval System at NTCIR-6

Our information retrieval system takes advantage of numerous characteristics of information and uses numerous sophisticated techniques. It uses Robertson’s 2-Poisson model and Rocchio’s formula, both of which are known to be effective. Characteristics of newspapers such as locational information are used. We present our application of Fujita’s method, where longer terms are used in retrieval by...

Publication date: 1997
